QnR-Learning in Markov Games
Authors
Abstract
Similar resources
Utilizing Generalized Learning Automata for Finding Optimal Policies in MMDPs
Multi-agent Markov decision processes (MMDPs), the generalization of Markov decision processes to the multi-agent case, have long been used for modeling multi-agent systems and serve as a suitable framework for multi-agent reinforcement learning. In this paper, a generalized learning automata based algorithm for finding optimal policies in MMDPs is proposed. In the proposed algorithm, MMDP ...
Multiagent Reinforcement Learning in Stochastic Games
We adopt stochastic games as a general framework for dynamic noncooperative systems. This framework provides a way of describing the dynamic interactions of agents in terms of individuals' Markov decision processes. By studying this framework, we go beyond the common practice in the study of learning in games, which primarily focuses on repeated games or extensive-form games. For stochastic games...
Value Function Approximation in Zero-Sum Markov Games
This paper investigates value function approximation in the context of zero-sum Markov games, which can be viewed as a generalization of the Markov decision process (MDP) framework to the two-agent case. We generalize error bounds from MDPs to Markov games and describe generalizations of reinforcement learning algorithms to Markov games. We present a generalization of the optimal stopping probl...
Correlated Q-Learning
Recently, there have been several attempts to design multiagent learning algorithms that learn equilibrium policies in general-sum Markov games, just as Q-learning learns optimal policies in Markov decision processes. This paper introduces correlated-Q learning, one such algorithm. The contributions of this paper are twofold: (i) We show empirically that correlated-Q learns correlated equilibri...
Learning in Markov Games with Incomplete Information
The Markov game (also called stochastic game (Filar & Vrieze 1997)) has been adopted as a theoretical framework for multiagent reinforcement learning (Littman 1994). In a Markov game, there are n agents, each facing a Markov decision process (MDP). All agents’ MDPs are correlated through their reward functions and the state transition function. As Markov decision process provides a theoretical ...